Lake Mead Change Over Time
This week’s assignment will encompass the following concepts covered in Class 11 lecture & lab:
Landsat & Sentinel Satellite Programs
Multi-spectral Imagery
Band Combinations
Classification Methods - Supervised vs. Unsupervised
4 essential satellite image resolutions
Class 11 Lecture and Assignment Materials:
Vector Features for Assignment 11
2 Landsat .tar Packages Representing Year 2000 and 2021 Scenes
Note: the assignment directions feature image product acquisition. Section Part I - Step I - Data acquisition can be minimized by downloading the C11 Assignment Data Backup - Raster Packages which features all image inputs for the assignment. However, it is Strongly Recommended to work through the Data Acquisition steps with Earth Explorer in order to gain experience with the Landsat File Acquisition.
The class 11 quiz (Quiz 9 - Satellite Imagery) will cover only content from the Essentials of Geographic Information Systems textbook as follows:
Assignment Introduction:
The following assignment utilizes a supervised land cover classification method to determine contemporary extent reduction at Lake Mead, a critical water resource for multiple counties and states in the western US. The area of concern is located at an intersection of Arizona and Nevada states, slightly east of Las Vegas. Here water storage within the reservoir is near capacity in 2000, severely dwindling in extent over the next two decades. While the reservoir is measured in terms of total capacity as a factor of elevation and extent, through remote sensing we can access the spatial extent of the waterbody over time. In this way, we can gauge the locations within the reservoir that a diminishing over time.
‘Bathtub’ rings at Lake Mead - Evidence of Severe Reservoir Water Loss
Through the assignment classification process, land cover change between the two original input images can be determined. For this analysis, 2 Landsat images from 2000/08/07 - 2021/08/09 (two images) will be utilized.
The general framework for a image analysis within QGIS can be summarized as follows:
Input project datasets
Composite bands to create banded products
Classify banded imagery utilizing an Supervised Classification regime.
Run Change detection to ascertain land cover difference/change across time.
Quantify change and produce cartographic output.
Both the Landsat and Sentinel programs can be accessed freely and easily through web interfaces, API scripting and a collection of tools (plugins) for QGIS. In this assignment, imagery will be acquired via a web interface. Two of the most prominent web interfaces are as follows:
NASA’s Earth Explorer - https://earthexplorer.usgs.gov/
Earth Explorer Interface
Sentinel Interface
Note: regardless of the interface or product, its important to keep in mind that the acquired imagery is required to be cloudless (or as cloudless as possible) across both time periods, within the analysis area. Any clouds present will obstruct accurate analysis where those clouds exist in the product. While there are several ways to correct for cloud coverage, in the following workflow clouds will simply negate accurate analysis where they exist. This is an acceptable condition as the area of concern (analysis area) is itself relatively cloud free. Where there are clouds, they will be highlighted in the assignment steps below.
For this assignment, utilize Earth Explorer. First, register and create a user login via the following link:
Once complete, navigate to the Earth Explorer interface as a logged-on user. The first criteria will be location expressed as a polygon upload. In the C11 Assignment Data Backup - Vector Features, upload as shapefile the zipped folder entitled bounding_geometry.zip. This represents the bounding box for the vector feature lakebnds.shp which is the USGS feature for the known and authoritative extent of the Lake Mead Reservior. Make the first entry under Search Criteria in the Earth Explorer Tool Interface at left:
Upload bounding_geometry.zip to Earth Explorer
Data Sets tab. The products are 1 Landsat 7 scene and 1 Landsat 8 scene. You will need to expland the catalog tree to get to the correct products:Landsat 7 scene + 1 Landsat 8 Product Selection
The first scene is as follows under the Landsat 7 ETM+ C2 L2 tab:
ID: LE07_L2SP_039035_20000807_20200918_02_T1
Date Acquired: 2000/08/07
Path: 039
Row: 035
Page Location: 143 of 151
Landsat 7 ETM+ C2 L2
The second scene under the Landsat 8 OLI/TIRS C2 LS is as follows:
ID: LC08_L2SP_039035_20210809_20210819_02_T1
Acquisition Date: 2021/08/09
Path: 039
Row: 035
Page Location: 2 of 59
Landsat 8 OLI/TIRS C2 LS
download options (small green downward icon) first for the Landsat 7 scene, followed exactly by the Landsat 8 scene:download options
download options, you will be downloading the full bundle accessed by the top button for both the Landsat 7 and Landsat 8 products. This will result in .tar compressed packages that you will place into your assignment working directory:Product Bundle Download
.tar file extensions). Save a QGIS project into the folder for the assignment directory that you make. Further, make several folders inside the project folder - these will hold outputs that are derived during the assignment. The following assignment directory structure is recommended:Example Assignment Directory
c11 Assignment Vector Features
classed
2 .tar for the Landsat Scenes
processing
training
The two Landsat tile scenes are downloaded as .tar compressed folders. This type of compressor requires 7 Zip on Windows, or Keka on a Mac to uncompress. In Keka, extraction will produce an open directory ready to access. 7-zip, however, may produce an intermediate .tar compressed file. The .tar must then be extracted again to gain access to the root uncompressed data. In the end the directory image should look as follows:
Landsat Scenes Uncompressed in Directory
LE07_L2SP_039035_20000807_20200918_02_T1 directory first will show the Landsat bands 1 - 7 that will be utilized in QGIS. Note that band 6 is skipped:Import bands for the Landsat 7 Scene
Note: that the naming convention of the tile set will denote the product type and date, broken down as follows using the 2000 tile set as example:
LE07_L2SP_039035_20000807_20200918_02_T1
LE07 = Landsat 07
039035 = path 039, row 035
20000807 = year 2000, month 08, day 07
Note: the
bands#.tiffiles - the files that hold the raster cell values - end differently than other sidecar files of the product. The sidecar files are helpful metadata for other analyses that can be done with Landsat imagery. For this assignment, the focus will be the bands themselves; theSRreference denotes Surface Reflectance values.
There are several ways to created multiband, multispectral imagery in QGIS. In this assignment, we will rely on the Build Virtual Raster tool to perform this task as well as the supervised classification process.
Search for the tool in the processing toolbox after importing all bands into QGIS:
Bands Imported to Layers Panel
Search and Load Build Virtual Raster Tool
Populate the Tool; Make Sure Separate Bands is checked ON
Review Result Virtual
Virtual raster is successfully produced, this raster is now exported to the processing folder as 2000_composite`:Export as GeoTIFF
LC08_L2SP_039035_20210809_20210819_02_T1 Landsat 8 Scene. Note that the band array includes Band 6 for Landsat 8:Landsat 8 Bands
2021_composite to the processing folder.2000 and 2021 Composites
In the 2000 image, there is a clustered cloud cover at the right on the image, as well as a small portion over the center of Lake Mead. Ideally these clouds would not exist. As they are relatively minor, we will proceed with the image, and accept that we cannot accurately classify water value pixels where there are clouds.
Colors across both composites are not consistent. This will be corrected in the next step where the an appropriate band combination will be applied first to Landsat 7 2000, then Landsat 8 2021 scenes.
Landsat 7 compared to Landsat 8 Band Combinations
Landsat 7 - 4/5/3 Combination
Landsat 8 - 5/6/4 Combination
Note: Band combinations are a property of raster symbology. They can easily be reapplied as needed. Typically if the layers are saved with the combinations applied, the combinations will persist. However, if needed in the future, reapplication of a band combination is a quick, easy step as noted above.
With the band combinations complete and visualized to accentuate land cover differences, QGIS will be utilized to class pixels from continous values across the multi-band product into discrete, classes or ‘buckets’ representing land cover types. The classification algorithm must be trained to know which pixel combinations to place in one discrete class versus another discrete class. As much as this process is quantitative, there is an ‘art’ to classification for a supervised classification run. Here the analyst has jurisdiction over what pixel types will go into which land cover classes.
To Start, install the plugin - dzetsaka classification tool plugin:
dzetsaka classification tool plugin; the current stable release is 3.70 - March, 2021
dzetsaka Icon in Main Menu
Navigate to the processingfolder and load in the 2000_composite raster into the Layer Panel if it is not loaded currently. Make sure the 4/5/3 band combination is applied as it will be necessary for the classification steps below.
Next, a training shapefile will be created. This is a task that we have not done yet in this course; however, it not hard. Navigate Layer > Create Layer > New Shapefile layer:
New Shapefile Layer
The tool will be filled out with the following parameters:
~training/2000_training.shppolygonProject CRS: EPSG:32621 - WGS 84/UTM zone 11N (this is the current project projection system set from both rasters. Landsat Scenes will always come projected with their appropriate UTM Zone. Analysis should proceed in this map projection.)Class, insert it into the Name field, and click Add to Fields List.Add Class as Text Field; Save 2000_training.shp
.shp is now placed into the Layer Panel. Toogle ON editing via the digitizing toolbar located in Main Menu:Edit ON, Digitizing Toolbar
Add Polygon icon (green blob icon) and zoom into the water feature in the center of the image. The first class that will be captured will be ID 1 for water.Capturing the First Polygon for Water Features
Note: Make sure you are training the 2000 composite with the 2000 training, not the 2021 composite which will be done following the 2000 composite imagery.
Completed Training Polygon
If a Mistake is Made, Remove it from Attribute Table, then Save.
1 for the ID and water for the class. Make sure to capture lighter colored blues that are clearly water, but avoid capturing any shoreline pixels. Capture approximately 10 polygons of various sizes both in the center of the water body but also along shoreline areas. The goal is to get a representative sampling of all pixel combinations that represent the water extent of Lake Mead in the 2000 imagery. In the following image a polygon is created next to the shoreline, but not over the shoreline:Classing near, but not on, Shoreline
1 and a class name of water. Next, move to the 2 class which is land. Here the precision of the polygons is not as important; the goal is not to break down land types, but to set of everything that is land as separate from water. Capture land areas that are close to the shoreline; we will mask the imagery later in the process, and we will only keep those areas that are either water or areas that are not water, but close to the shoreline.Again, create approximately 10 polygons, but not more - 10 should be more than enough for the land class. Check the attribute table and the results:
Training Polygons Complete for 2000 Composite
Completed Training Attribute Table
Note: for the clouds in the far right of the 2000 composite image, class those white, light purple pixels as
land, i.e. we will not include them as the primary water class. Also note that some of the polygons in the example image above are not particularly close to the shoreline; these are not the best training locations as we will mask the imagery before classification. A better approach is to keep all training in and close to the Lake Mead water extent.
processing as 2021_training.shp. Replicate all the steps taken for the 2000 composite, but creating training polygons within the 2021_training.shp based on the 2021 composite imagery, NOT the 2000 composite imagery.2021 Training Setup
Critical Note: 2010 training is for the 2000 composite; a separate training is done for the 2021 composite, named
2021_training.shp. Make sure to not mix up the trainings - one for 2000 and then one for 2021, based on the respective imagery.
Training Complete
With the training complete, the classification can begin. However, we want the imagery to be masked as close the feature of interest - the Lake Mead water extent. As we are not interested in land cover outside the water extents, we can remove these geographies from classification through a masking process. This will dramatically speed up classification processing as the classification algorithm does not have to evaluate pixels outside the masked area.
To Start, input the lake_mead_bounding_box.shp into the project (unzip bounding_geometry.zip first):
Masking Geometry
Mask Raster - Processing Toolbox
Mask Parameters - 2000
Mask Parameters - 2021
Save 2000_composite_masked.tif
Save 2021_composite_masked.tif
Imagery Masked to Lake Mead Water Extent
Navigate to the dzetsaka classification tool and populate for the first run - 2000 imagery:
Dzetsaka Classification Tool - 2000 Imagery Population
K Nearest Neighbors. The classifcation will run on the id variable - 1 for water, 2 for land. The tool will run for upwards 20 minutes, but may be as quick as several minutes depending on the processing capacity of your machine. Once complete, it will place the classification results in to the layers panel.Note: If you are unable to select the
K Nearest Neigbhorsmethod, or it asks for the scikit-learn python library, you have two choices:
- Stay with the default, built-in
Gaussian Mixture Model. The upside is that you don’t have to do anything further, and this methods runs quickly. The downside is the resulting classification will likely be slightly less accurate to the water body extent than theK Nearest Neigbhors.
- Install the python library necessary for the
K Nearest Neigbhorsmethod to run within the dzetsaka tool. In the section scikit-learn install located at the end of the assignment, this extra installation step covers installation for both Windows and Linux/macOS.
You can symbolize as Paletted/Unique values, and then compare the classfication results to the original imagery. A good classification should clearly demarcate water from the shoreline, and land areas:
IPaletted/Unique values Applied
Classed Result
Save the QGIS project. Repeat all the steps that were just done for the 2000 masked imagery but now using the 2021, using the respective training set for 2021, NOT the 2000 training set. Once complete, export both classification results to the classed folder as 2000_classed.tif and 2021_classed.tif, respectively.
Clear the workspace and keep just the 2000_classed.tif and 2021_classed.tifrasters and save the updated project QGIS.
Next the difference between the two rasters will be calculated. There are several ways to approach the difference problem. As is, there are several values in each classed result - 1, 0 and 2 representing the original id values, and then also values along the edges of the image that contain neither values.
Possible difference combinations include the following:
We are interested in the condition -1 where there was water in 2000 but became land in 2021. This represents the diminishment of the Lake Mead water extent over the past two decades, largely due to climate change dynamics that have severely stressed this water resource.
Before running a difference function, zoom in and compare your classification results. Your 2021 classed raster should show a water extent that is definitely diminished from the 2000 classed raster:
Classed Differences Across 2000 > 2021
Open the raster calculator and form the following equation, exporting the raster result as 2000_2021_difference:
"2000_classed@1" - "2021_classed@1"Save 2000_2021_difference to classed folder
-Review the result. Note the 1 value; this is the cloud condition over the water extent. We will disregard this class. The focus for the further analysis will be the -1 value:
Classed Difference Result
-1 value will be isolated by setting all export values except -1 to no data. Export the 2000_2021_difference to difference_final in the classed folder, and populate the no data parameters as follows:Exclude Values from Final Result
no data vs the -1 value:Final Classification Values for Water Extent Difference
Organize a new workspace and save c.11.assignment.cartography with the following layers broken into two layer groups - main.map.frame and inset.frame:
difference_final
ESRI Satellite base or another base layer accessed from the QuickMapServices plugin
lake_mead_boundary_box
cb 2018 us states
Final Cartography Organization
At this juncture, most of the elements needed for the final map layout are ready. In the concluding section below, statistics are developed so that deforestation area and percentage can be included in the final map design. The final map design should include the following:
Landscape or portrait orientation map sheet (your choice)
Main map frame featuring the difference pixels atop quickmapservices basemap imagery.
An inset map denoting location of the project area.
Titling that explains the location, date range and thematic purpose of the map.
Simple legend for the difference class (-1).
Source the data as follows: Landsat-8 and Landsat-7 analysis imagery courtesy of the U.S. Geological Survey
See the following layout example for basic design positioning suggestions:
Assignment 11 Draft Layout
In the example above, the source tag has not been included. The projection for the inset map has been altered to preserve a horizontal orientation for the inset map. The main map frame utilizes the UTM Zone 11N for the scale bar. The lake mead bounding geometry was used explicitly for the inset map to show the main map frame extent. Remember, we can avoid this by using the Overview insert capacity for map inset in Map Layout.
As detailed in the section above, the map product will feature the water surface loss in the map layout. Of course this is helpful to conveying the spatial distribution of main map theme - water loss. However, we don’t have a quantitative areal or percentage expression. This can be achieved easily, and incorporated into the final map design as a text explainer or developed within the legend.
Save the c.11.assignment.cartography QGIS project and open as new project and import just the classed layers:
difference_final
2021_classed
2000_classed
Classed Raster Layers
Raster Report
First, place the 2000_classed raster into the tool parameter, and run as a temporary report. This will produce an html report in the lower right of the Map Canvas window.
This will produce a report.html document which can be accessed and seen as follows:
report.html
report.html
This tells us the total square area (squared meters) of the 1 value in 2000 is 549405900 m² or 212 mi².
To express units as square miles, the following equation can be utilized:
mi² =m² * 0.00000038610Next, run the same tool process on the 2021_classed raster. In the example, the 1 value is found to be 303937200 m² or 117 mi². The difference between the 1 value 2000_classed - 2021_classed will be similar as -1 in the difference_final, accounting for the condition where clouds that were present over water in the 2000_classed layer where removed from consideration as a water value. This discrepancy should amount to a small fraction of the total extent in 2000. In the example classification, it accounts for 711000 m², or 0.27 mi².
In the end, the difference_final extent is found to be 246179700 m² or 95 mi². This is the amount of water surface extent that has been displaced since 2000.
The loss of 303226200 m² or, again, 117 mi², constitutes a 44% percent loss in surface extent from 2000 to 2021, based on the following formula:
Original Value(water 2000) - Final Value(water 2021) / Original Value(water 2000) * 100
For the final cartographic design, consider incorporating your statistics into the final map layout to help contextualize the thematic purpose of the map - surface water extent loss 2000 - 2021 at the Lake Mead Reservoir.
As part of the QGIS install, a Python installation also occurs in the background, and a series of Python Libraries are also installed. However, scikit-learn is not one of them. This Python library features a series of cluster algorithms, including k-Means. When the dzetsaka tool calls on the K Nearest Neigbhors method, it relies on this library. To start, the installation for a Windows environment is shown, followed by a Linux/macOS environment. The steps below are also shown in the guidance for the dzetsaka tool.
Windows Install:
For Windows, the first find and navigate to OsGeo shell which is a command line environment for the QGIS install:
Search for the OsGeo shell within the Windows Start Search
o4w_env to set up the environment:OsGeo shell
pip is called to install the scikit-learn package:python3 -m pip install scikit-learn -U --user
Type of the Command for the Pip Install
pip install, you may be reminded to upgrade to a newer version; you don’t need to do that if the scikit-learn package is successful:Install Log within the Shell
Restart your machine, open QGIS and navigate back to your project and the K Nearest Neigbhors method should now run without complication.
Linux/macOS
For Linux/MacOS, the first find and navigate to terminal on your machine which should be located in Utilities:
macOS Utilities
Terminal Icon
python3 -m pip install scikit-learn -U --user
Install scikit-learn via Command Line
Sucessful Install
K Nearest Neigbhors method should now run without complication.